ERROR-RESISTANT NARROWBAND VOICE ENCODER


INTRODUCTION 

Tactical voice communications are often brief, but the success of the mission and even the lives
of the personnel often depend on the reliable transmission of a few critical messages. The linear
predictive coder (LPC) operating at 2400 bits per second (b/s) will be widely deployed to support
tactical voice communication over narrowband channels. It provides good speech intelligibility in
error-free conditions; however, the speech quality degrades rather quickly in the presence of bit
errors. As indicated in Fig. 1, intelligibility of LPC-processed speech is poor at a bit-error rate
of 3%. (Unless otherwise stated, the 2400-b/s LPC referred to is the Government standard 2400-b/s
LPC defined by Federal Standard 1015 [1].)


Fig 1 — Speech intelligibility of the 2400-b/s LPC in terms of random bit-error rate (horizontal
axis: random bit error, percent). As noted, speech quality becomes poor if the bit-error rate
exceeds 3%. (The descriptors "good," "fair," etc. have recently been adopted by the DoD Digital
Voice Processor Consortium.) The lack of robustness is partly caused by the fact that an error in
any one of the LPC coefficients alters the speech spectrum over the entire passband. It is
interesting to note that earlier channel vocoders were more error-resistant because an error in
one channel output affected the speech spectrum only in one narrow frequency band.

Thus the availability of a more robust voice terminal seems highly desirable. There are, however,
two problems in providing an improved capability to narrowband tactical communicators.

• The data rate must be low enough to permit transmission over narrowband channels that
have a bandwidth of approximately 3 kHz. Most tactical communicators cannot use wideband
systems because they do not have access to wideband channels.

• An improved voice processor in a separate package will not help most tactical communicators
because their platforms are too congested to carry more than one voice terminal (e.g.,
amphibious vehicles, high-performance aircraft, armored personnel carriers, jeeps, or tanks).
Certainly tactical radio operators operating on foot (Fig. 2) must rely on a single voice
terminal, and it must be the 2400-b/s LPC because it is the only narrowband voice processor that
has been standardized for interoperability.

The most practical and cost-effective way of providing tactical communicators with improved
capability is to integrate the improved voice processing and modem software into an existing
2400-b/s LPC terminal, such as the Advanced Narrowband Digital Voice Terminal (ANDVT). Narrowband
users would then have both the 2400-b/s LPC and the improved voice processor without requiring a
new radio transmitter, antenna, central processing unit (CPU), packaging, communication security
(COMSEC) unit, etc. The operator may manually select one of the two modes. The transmitter and
receiver may alternately probe the channel during the preamble period, and the transmitter may
automatically select a preferred mode based on the channel conditions. The resulting voice
terminal is an example of an expert system. As technology advances, the voice terminal could have
more elaborate voice processing and error protection algorithms.

To increase the robustness of the narrowband voice processor under conditions of channel bit
errors, we have encoded the speech at a low rate (i.e., 800 b/s) and let coding and modulation
bring up the data rate so that it is compatible with transmission over the narrowband channel.
Note that protection of voice information need not be as sophisticated as protection of digital
data because speech has many redundancies and the powerful human brain deciphers the information.
According to extensive test data we have collected from various voice processors, LPC-processed
speech is intelligible even under 1 or 2% errors. Actually, tactical communication can function
under even worse error conditions because:

• tactical communicators use a limited and highly specialized vocabulary consisting of words
and phrases that are designed to be easily distinguished in poor signal-to-noise conditions;

• the type of information that is likely to be communicated is also highly dependent on the
nature of the mission and the stage in the sequence of the mission, so that the communicators
know what to expect;

• tactical communicators are accustomed to poor-quality speech, and can accept a less-than-ideal
voice terminal if there is no other way to communicate.

Hence, even a slightly improved voice processor would be a help to tactical communicators.

The implementation of a robust narrowband voice terminal presents many technical obstacles.
The most difficult problem is to devise a low-rate voice processor capable of providing highly
intelligible speech. We have been working in this area for nearly a decade, and only recently have
we succeeded in implementing what appears to be a satisfactory 800-b/s voice processor [2,3]. In
terms of the Diagnostic Rhyme Test (DRT), the intelligibility of the 800-b/s voice processor is
only 1.4 points below that of the 2400-b/s LPC. We describe this voice processor in this report.




NRL  REPORT  9018 


From pg 41 of "The Government Standard Linear Predictive Coding Algorithm: LPC-10,"
T. E. Tremain, Speech Technology, April 1982, Vol. 1, No. 2. Copyright 1982, used by

Fig 2 — A tactical radio communicator operating on foot. Similar to tactical radio communicators
operating in amphibious vehicles, armored personnel carriers, jeeps, and tanks, he cannot carry
more than one voice terminal. The purpose of this report is to describe improved voice processing
and modem software that can be incorporated into the 2400-b/s LPC so that the operator can choose
either the 2400-b/s LPC or a more error-resistant narrowband voice mode.

frames only [1]. Since the unvoiced speech spectrum does not have predominant resonant
frequencies, only four LPC coefficients are transmitted for each unvoiced frame. Thus the 21 bits
used to encode the fifth through tenth LPC coefficients are freed. By using these 21 bits, the
four most significant bits (MSBs) of the amplitude parameter and of the first four reflection
coefficients are protected (Table 1). Because silence is transmitted as unvoiced frames, the most
apparent benefit of this particular error protection is a reduction of loud "pops" during silence
periods. This is because the amplitude parameter that controls the loudness of the synthesized
speech will have fewer errors.


Table 1 — Error-Protected Unvoiced Speech Data of the 2400-b/s LPC. The bits indicated by shaded
blocks are protected by an (8,4) Hamming code. The seven-bit pitch/voicing parameter has some
redundancies because there are only 61 possible pitch values. Thus a one-bit error in the seven
zeros that indicate the unvoiced state can be decoded correctly.

The previously mentioned ANDVT employs more powerful error protection in the high-frequency
(HF) modem [8,9]. Among the 54 bits of voice data from each frame (at a frame rate of 44.444 Hz),
the perceptually most significant 24 bits (Table 2) are error-protected by a Golay (24,12) code.
A total of 78 bits is modulated on 39 tones, each separated by 56.25 Hz. The transmission rate is
thus 44.444 frames/s with 78 bits/frame, for a total of 3466.67 b/s. This additional 1066.67 b/s
over 2400 b/s improves the performance significantly, as shown in Fig. 16b.
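The rate arithmetic in the preceding paragraph can be reproduced with a few lines of Python. This is only a bookkeeping sketch: the names are ours, and it assumes (the text does not say so explicitly) that the 24 protected bits are split into two 12-bit Golay (24,12) codewords, which is what makes the frame grow from 54 to 78 bits.

```python
from fractions import Fraction

FRAME_RATE = Fraction(400, 9)   # 44.444... frames/s, i.e., one frame every 22.5 ms
VOICE_BITS = 54                 # LPC voice bits per frame
PROTECTED_BITS = 24             # perceptually most significant bits
GOLAY_N, GOLAY_K = 24, 12       # Golay (24,12): 12 data bits become 24 coded bits
TONES = 39


def andvt_rates():
    """Return per-frame bit counts and bit rates for the protected 2400-b/s mode."""
    # Assumption: two codewords, each carrying 12 of the 24 protected bits.
    codewords = PROTECTED_BITS // GOLAY_K
    parity_bits = codewords * (GOLAY_N - GOLAY_K)
    bits_per_frame = VOICE_BITS + parity_bits
    return {
        "bits_per_frame": bits_per_frame,
        "bits_per_tone": bits_per_frame // TONES,
        "voice_rate": float(VOICE_BITS * FRAME_RATE),
        "channel_rate": float(bits_per_frame * FRAME_RATE),
    }


if __name__ == "__main__":
    for name, value in andvt_rates().items():
        print(name, value)
```

The two bits per tone that fall out of this count are consistent with the four-phase DPSK modulation described later in the report.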

Previous Efforts on Very Low Data Rate Voice Encoding

For many years we have been investigating voice encoders operating at data rates between 600
and 800 b/s (Table 3). Since this is approximately 1% of the data rate of unprocessed digitized
speech, some degradation of speech intelligibility is inevitable. Only recently have we been able
to devise a voice processor capable of generating high-quality speech at 800 b/s. The Diagnostic
Rhyme Test (DRT) score for three male speakers over the 800-b/s system is 87.0. Currently, this is
the highest score attained by any voice processor operating at a fixed data rate of 800 b/s. The
most striking difference between this voice processor and others is the use of new speech
parameters called line spectrum pairs (LSPs). We discuss the various aspects of the LSPs from
pages 7 through 17.

Table 2 — Speech Data Protected by the ANDVT Modem. The first four reflection coefficients are
more critical to the speech spectrum than the remaining reflection coefficients. Likewise, a pitch
error is readily perceived by the listener. Hence, the MSBs of these parameters, indicated by
shaded blocks, are protected.

Table 3 — Our Previous Efforts on Low Data Rate Voice Processor Development

Year  Effort    Parameters               Real Time  Data Rate (b/s)  DRT         Ref.
1976  In-house  Formant Frequencies      No         600              79.9 (1M)*  11,12
1980  Contract  Reflection Coefficients  No         800              80.0 (2M)   13
1981  Contract  Reflection Coefficients  Yes        800              78.3 (3M)   14
1983  In-house  Reflection Coefficients  Yes        800              82.8 (3M)   15
1984  Contract  Reflection Coefficients  Yes        800              79.7 (3M)   16
1985  In-house  Line-spectrum Pairs      No         800              87.0 (3M)   2,3

BLOCK DIAGRAM


In our approach, an improved speech encoder is an extension of the 2400-b/s LPC. The speech
parameters are generated by the standard 2400-b/s LPC analyzer. As indicated in the block diagram
shown in Fig. 4, the 2400-b/s LPC may be converted to an error-resistant LPC by adding the
following three computational modules:

• coefficient converter

• 800-b/s voice encoder

• error protector

These three blocks are discussed in the following sections.


Fig. 4 — Block diagram of dual-mode narrowband voice processor. Addition of the grey blocks
converts the 2400-b/s LPC to a more robust voice processor based on 800-b/s LPC (denoted by
800-b/s).


Figure 4 is a block diagram that shows the speech parameters used in the generation of the
800-b/s bit-stream. We recommend the use of unquantized speech parameters rather than a
rate-conversion approach that uses quantized parameters, since this produces higher speech
intelligibility [14].


COEFFICIENT CONVERSION


The improved voice processor converts the set of prediction coefficients (PCs) generated by the
LPC analysis into a set of line spectrum pairs (LSPs), and vice versa. We present these conversion
algorithms below.

Definition of LSP

The LPC analysis filter converts speech samples to prediction residual samples. Since a residual
sample is defined as the difference between the input speech sample and the predicted sample
(i.e., the sample estimated by a weighted sum of past samples), the transfer function of the LPC
analysis filter A(z) may be expressed as

$$A(z) = 1 - a_1 z^{-1} - a_2 z^{-2} - \cdots - a_n z^{-n}, \qquad (1)$$

where $a_n$ is the $n$th prediction coefficient. Prediction coefficients are obtained by
minimizing the mean-square value of the prediction residual. The LPC synthesizer is the inverse of
the LPC analysis filter, $1/A(z)$. Prediction coefficients are convenient parameters for the LPC
analysis/synthesis because they are obtained directly through the LPC analysis. A serious
limitation, however, is that an error in one coefficient affects the speech spectrum over the
entire passband.

The LPC analysis filter A(z) may also be expressed in the factored form:

$$A(z) = \prod_{i=1}^{n} \left(1 - z_i z^{-1}\right), \qquad (2)$$

where $z_i$ is the $i$th root of the LPC analysis filter (Fig. 5). The advantage of encoding roots
is that an error in one root affects the speech spectrum only near that frequency. The roots of
the LPC analysis filter have never been used as filter parameters because a fixed-point arithmetic
unit (often used in the 2400-b/s LPC) cannot successfully extract these roots from a 10th-order
polynomial.

[Fig. 5 (graphic not reproduced). Panel titles: "Roots of A(z)," with
$A(z) = \prod_{i=1}^{n} (1 - z_i z^{-1})$; "Polynomial Decomposition to Even & Odd Functions, and
Polynomial Factorizations," with $A(z) = 1 - \sum_{i=1}^{n} a_i z^{-i} = (1/2)[P(z) + Q(z)]$,
where $P(z) = A(z) - A^*(z)$ (odd), $Q(z) = A(z) + A^*(z)$ (even), and
$A^*(z) = z^{-(n+1)} A(z^{-1})$ (mirror image); "Line-Spectrum Pairs," the roots of P(z) and
Q(z).]

To alleviate computational difficulties in searching for roots in a two-dimensional space, the LPC
analysis filter may be decomposed into a sum of two filters, each of which has its roots along the
unit circle of the complex z-plane. This can be accomplished by taking the sum and the difference
of A(z) and its conjugate function (i.e., the transfer function of the filter whose impulse
response is a mirror image of that of A(z)):

$$P(z) = A(z) - z^{-(n+1)} A(z^{-1}), \qquad (3)$$

and

$$Q(z) = A(z) + z^{-(n+1)} A(z^{-1}). \qquad (4)$$

The LPC analysis filter, reconstructed by the sum of these two filters, is

$$A(z) = \frac{1}{2}\left[P(z) + Q(z)\right]. \qquad (5)$$

Equation (5) is an equivalent representation of the LPC analysis filter A(z) in which P(z) and
Q(z) are component filters. We will encode the parameters of P(z) and Q(z).
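Equations (3) through (5) can be checked numerically with a short sketch. The function name `split_pq` and the list layout (coefficients of $z^0$ through $z^{-(n+1)}$) are illustrative choices of ours, not part of the report; the example coefficients are arbitrary.

```python
def split_pq(a):
    """Split A(z) into P(z) and Q(z) per Eqs. (3) and (4).

    a : prediction coefficients [a_1, ..., a_n] as used in Eq. (1).
    Returns (P, Q), each a list of n + 2 coefficients of z^0 ... z^-(n+1).
    """
    # Impulse response of A(z), padded with one zero to length n + 2.
    A = [1.0] + [-x for x in a] + [0.0]
    # z^-(n+1) A(1/z) is the same sequence in reverse order (the mirror image).
    mirror = A[::-1]
    P = [u - v for u, v in zip(A, mirror)]  # odd-symmetric about its midpoint
    Q = [u + v for u, v in zip(A, mirror)]  # even-symmetric about its midpoint
    return P, Q


if __name__ == "__main__":
    P, Q = split_pq([0.5, -0.2])            # A(z) = 1 - 0.5 z^-1 + 0.2 z^-2
    print(P)                                 # antisymmetric sequence
    print(Q)                                 # symmetric sequence
    print([(p + q) / 2 for p, q in zip(P, Q)])  # recovers A(z), Eq. (5)
```

Because P is antisymmetric its coefficients sum to zero (a root at z = 1), and the alternating sum of Q's coefficients is zero (a root at z = -1), which matches the factorizations that follow.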

The impulse response of P(z) expressed by Eq. (3) is odd symmetric with respect to its midpoint.
Thus one real root is at $z = 1$, and the other roots are at $z = \exp(\pm j 2\pi f_k t_s)$, where
$f_k$ is a member of the $k$th LSP, $t_s$ is the speech sampling time-interval, and
$j = \sqrt{-1}$. Thus, P(z) may be factored as:

$$P(z) = (1 - z^{-1}) \prod_{k=1}^{n/2} \left(1 - e^{j 2\pi f_k t_s} z^{-1}\right)\left(1 - e^{-j 2\pi f_k t_s} z^{-1}\right)$$

$$\phantom{P(z)} = (1 - z^{-1}) \prod_{k=1}^{n/2} \left[1 - 2\cos(2\pi f_k t_s)\, z^{-1} + z^{-2}\right]. \qquad (6)$$

On the other hand, Q(z) is even symmetric with respect to its midpoint. Thus, one real root is at
$z = -1$, and the other roots are at $z = \exp(\pm j 2\pi f'_k t_s)$. Thus,

$$Q(z) = (1 + z^{-1}) \prod_{k=1}^{n/2} \left[1 - 2\cos(2\pi f'_k t_s)\, z^{-1} + z^{-2}\right], \qquad (7)$$

where $f'_k$ is the other member of the $k$th LSP. Both $f_k$ and $f'_k$ are yet to be
determined; their estimation is discussed next.


PC-to-LSP Conversion

The conversion of PC to LSP consists, in essence, of finding the roots of P(z) and Q(z). Since
the roots are along the unit circle, they may be found by searching for null frequencies of the
amplitude spectra. Figure 6 shows the computational modules needed for estimating LSPs. Since a
256-point complex FFT is involved, the required computational load is not trivial. The current CPU
used in the ANDVT would take approximately 3 ms (i.e., 13% of the 22.5-ms frame). To implement
this approach, a more powerful CPU would therefore be needed.


Since the impulse responses of P(z) and Q(z) are real, their amplitude spectra may be obtained
simultaneously through the use of a single complex fast Fourier transform (FFT) [16]. Initially,
the impulse responses of P(z) and Q(z) are loaded in the real and imaginary input FFT buffers,
respectively. Then the remaining 244 samples are zero-padded for the Fourier transform. The real
and imaginary parts of the output are descrambled to obtain the two sets of amplitude spectra
[16]. A transform size of 512 provides a frequency resolution of 4000 Hz/256 = 15.625 Hz.
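The packing and descrambling steps described above can be sketched as follows. This is a minimal illustration, not the report's implementation: the recursive radix-2 `fft` helper and the name `two_real_spectra` are ours, and the transform size is left as a parameter (any power of two). With the real sequence in the real part and the other in the imaginary part, the two spectra are recovered from $X(k)$ and the conjugate of $X(N-k)$.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def two_real_spectra(p, q, size=512):
    """Amplitude spectra of two real sequences from one complex FFT.

    p goes into the real input buffer and q into the imaginary one; both are
    zero-padded to `size`. Returns (|P(k)|, |Q(k)|) for k = 0 .. size/2 - 1.
    """
    buf = [complex(p[i] if i < len(p) else 0.0,
                   q[i] if i < len(q) else 0.0) for i in range(size)]
    X = fft(buf)
    amp_p, amp_q = [], []
    for k in range(size // 2):
        xk = X[k]
        xc = X[-k % size].conjugate()      # conj(X[N - k]), with X[N] = X[0]
        amp_p.append(abs((xk + xc) / 2))   # descrambled spectrum of p
        amp_q.append(abs((xk - xc) / 2j))  # descrambled spectrum of q
    return amp_p, amp_q


if __name__ == "__main__":
    ap, aq = two_real_spectra([1.0, -0.7, 0.7, -1.0], [1.0, -0.3, -0.3, 1.0], size=16)
    print(ap[0], aq[0])
```

The saving is one transform instead of two; the descrambling itself costs only a few additions per frequency bin.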

Fig 6 — Computational modules required for estimating LSPs. For the 10th-order LPC, the impulse
response of the LPC analysis filter A(z) is 11 samples. Thus both P(z) and Q(z) have 12 samples.


Search of Null Frequencies

Let the amplitude spectral components of either P(z) or Q(z) at frequency $f(j)$ be denoted by
$y(j)$, $j = 1, 2, \ldots, 256$. The line spectrum is the null frequency, where the amplitude
spectrum is at its local minimum. Thus three consecutive spectral points, $y(i-1)$, $y(i)$, and
$y(i+1)$, have the following relationships near the null frequency $f(i)$:

$$y(i) < y(i-1), \quad \text{for } 2 \le i \le 255,$$

and

$$y(i) < y(i+1), \quad \text{for } 2 \le i \le 255. \qquad (8)$$

Since the frequency resolution of the FFT is 15.625 Hz, the error in the estimated line spectrum
is uniformly distributed between -7.8125 Hz and 7.8125 Hz. The estimated line spectrum, however,
may be refined through a simple parabolic approximation based on the three consecutive spectral
points (Fig. 7).


Fig 7 — Refinement of the estimated line-spectral value through a parabolic fitting. (A parabola
is passed through the three spectral points $y(i-1)$, $y(i)$, and $y(i+1)$ at frequencies
$f(i-1)$, $f(i)$, and $f(i+1)$; its minimum gives the line-spectrum location $f'(i)$.) If no
frequency correction is made, $f(i)$ is the estimated line spectrum, which would have an error
somewhere between -7.8125 and 7.8125 Hz. If frequency correction is made, the estimated line
spectrum is within a few Hz.

Substituting the three consecutive spectral points in the equation of a parabola, and finding the
frequency that makes the gradient of the parabola zero, we obtain the refined line spectrum. Thus

$$f'(i) = f(i) + \frac{\Delta f}{2} \cdot \frac{y(i-1) - y(i+1)}{y(i-1) - 2y(i) + y(i+1)}, \qquad (9)$$

where $f'(i)$ is the refined line spectrum and $\Delta f = f(i+1) - f(i)$ is the frequency
resolution of the FFT (15.625 Hz).
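The null search of Eq. (8) and the parabolic refinement can be sketched together as below. This is an illustrative sketch under stated assumptions: the function name `refine_nulls` is ours, indices are zero-based, and the spectral points are assumed to lie at $f(j) = j\,\Delta f$. The correction term is simply the vertex of the parabola through the three points.

```python
def refine_nulls(y, df=15.625):
    """Locate local minima of an amplitude spectrum and refine them.

    y  : amplitude-spectrum samples, assumed taken at f(j) = j * df (zero-based j)
    df : frequency spacing of the samples in Hz
    Returns the refined null frequencies in Hz, in ascending order.
    """
    nulls = []
    for i in range(1, len(y) - 1):
        # Eq. (8): a strict local minimum between its two neighbors.
        if y[i] < y[i - 1] and y[i] < y[i + 1]:
            # Positive for a strict minimum, so the division is safe.
            curvature = y[i - 1] - 2.0 * y[i] + y[i + 1]
            # Vertex of the parabola through the three points, in units of df.
            offset = 0.5 * (y[i - 1] - y[i + 1]) / curvature
            nulls.append((i + offset) * df)
    return nulls


if __name__ == "__main__":
    # A spectrum that is exactly parabolic around 100 Hz is recovered exactly.
    y = [(j * 15.625 - 100.0) ** 2 for j in range(16)]
    print(refine_nulls(y))
```

The offset always lies within half a bin of the coarse estimate, so the refinement can only move the estimate inside the interval already implied by the FFT resolution.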

Figure 8 shows typical LSP trajectories obtained from actual speech samples. As noted, there are
similarities between the trajectories of LSPs and speech resonant frequencies because both are
frequency-domain parameters. Thus, an error in one line spectrum affects the synthesized speech
spectrum only near that frequency. To exploit the listener's decreased sensitivity to frequency
differences in the upper frequency region, we can quantize high-frequency LSPs more coarsely than
low-frequency LSPs. This is a major advantage of this approach.


Fig 8 — Typical LSP trajectories and spectrogram of the original speech (utterance shown:
"Thieves who rob friends deserve"): (a) spectrogram; (b) LSP trajectories, both plotted against
time (s). Since LSPs are located near the speech resonant frequencies, their trajectories are very
similar.

LSP-to-PC Conversion

The LSP-to-PC conversion is much more straightforward than the PC-to-LSP conversion. A set of
LSPs can be converted to PCs by solving for the coefficients of the polynomial that represents the
transfer function of the LPC analysis filter, A(z). Substituting Eqs. (6) and (7) into Eq. (5)
gives

$$A(z) = \frac{1}{2}(1 - z^{-1}) \prod_{k=1}^{n/2} \left[1 - 2\cos(2\pi f_k t_s)\, z^{-1} + z^{-2}\right]$$

$$\phantom{A(z) =} + \frac{1}{2}(1 + z^{-1}) \prod_{k=1}^{n/2} \left[1 - 2\cos(2\pi f'_k t_s)\, z^{-1} + z^{-2}\right]. \qquad (10)$$

When the product terms are multiplied out, the resultant polynomial is in the following form:

$$A(z) = 1 + \beta_1 z^{-1} + \beta_2 z^{-2} + \cdots + \beta_n z^{-n}. \qquad (11)$$

Comparing term by term with Eq. (1) indicates that the $i$th prediction coefficient is
$-\beta_i$ (where $1 \le i \le n$).
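Equations (10) and (11) amount to multiplying out second-order sections, which the following sketch does directly. The function names and the 8-kHz sampling rate default are our assumptions for illustration; `f_p` holds the line spectra belonging to P(z) and `f_q` those belonging to Q(z).

```python
import math


def poly_mul(u, v):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out


def lsp_to_pc(f_p, f_q, fs=8000.0):
    """Convert line spectra (Hz) to prediction coefficients [a_1 .. a_n].

    f_p : the n/2 frequencies f_k of P(z), Eq. (6)
    f_q : the n/2 frequencies f'_k of Q(z), Eq. (7)
    fs  : sampling rate in Hz, so that t_s = 1 / fs (assumed 8 kHz here)
    """
    P = [1.0, -1.0]   # the (1 - z^-1) factor
    Q = [1.0, 1.0]    # the (1 + z^-1) factor
    for f in f_p:
        P = poly_mul(P, [1.0, -2.0 * math.cos(2.0 * math.pi * f / fs), 1.0])
    for f in f_q:
        Q = poly_mul(Q, [1.0, -2.0 * math.cos(2.0 * math.pi * f / fs), 1.0])
    # Eq. (10): beta_0 = 1, and the z^-(n+1) terms of P and Q cancel.
    beta = [(p + q) / 2.0 for p, q in zip(P, Q)]
    # Eq. (11) versus Eq. (1): a_i = -beta_i for 1 <= i <= n.
    return [-b for b in beta[1:-1]]


if __name__ == "__main__":
    fp = math.acos(-0.15) / (2.0 * math.pi) * 8000.0
    fq = math.acos(0.65) / (2.0 * math.pi) * 8000.0
    print(lsp_to_pc([fp], [fq]))   # close to [0.5, -0.2]
```

No root finding is involved in this direction, which is why the report describes it as the more straightforward of the two conversions.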

800-B/S VOICE ENCODER/DECODER

Bit Allocations

According to our experimentation, the most critical factor affecting speech intelligibility is the
number of bits assigned to encode the filter parameters. Hence we encode both the pitch period and
speech amplitude parameters as coarsely as the ear can tolerate. The remaining bits are allocated
to encode the filter parameters.

(a) Pitch Period

The pitch period is encoded into five bits (12 steps/octave with a frequency range from 66.67 to
400 Hz). The pitch resolution is perceptually adequate, so there will be no impression of a
singing inflection, although the pitch is quantized to the chromatic equitempered scale. Since the
pitch does not change too radically in normal conversation, it is transmitted only once every
three frames.
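A quick way to see why five bits suffice: 66.67 to 400 Hz spans log2(6), or about 2.58 octaves, which at 12 steps per octave is 31 steps, i.e., 32 levels. The sketch below illustrates such a quantizer. It is expressed in pitch frequency rather than pitch period, and the index assignment (0 for the lowest pitch) and function names are our assumptions; the report does not give the actual code table.

```python
import math

F_MIN = 200.0 / 3.0       # 66.67 Hz, lowest pitch frequency
STEPS_PER_OCTAVE = 12     # chromatic equitempered scale
LEVELS = 32               # five bits


def encode_pitch(f0_hz):
    """Map a pitch frequency in Hz to a 5-bit index (hypothetical index order)."""
    step = round(STEPS_PER_OCTAVE * math.log2(f0_hz / F_MIN))
    return max(0, min(LEVELS - 1, step))   # clamp to the 5-bit range


def decode_pitch(index):
    """Return the quantized pitch frequency in Hz for a 5-bit index."""
    return F_MIN * 2.0 ** (index / STEPS_PER_OCTAVE)


if __name__ == "__main__":
    for f0 in (66.67, 100.0, 220.0, 400.0):
        idx = encode_pitch(f0)
        print(f0, idx, round(decode_pitch(idx), 2))
```

Each step is one semitone (a ratio of about 1.059), so the worst-case quantization error is roughly 3% of the pitch frequency.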

(b) Amplitude Parameter

The amplitude parameter is the root-mean-square value of the speech waveform computed from each
frame (i.e., every 22.5 ms). The amplitude parameter is quantized to one of 16 levels in 3-dB
steps and transmitted once per frame. In comparison with the 2400-b/s LPC, the resolution of the
amplitude information is one bit less, but casual listening cannot detect the difference.

(c) Sync Bit

Since the pitch period is transmitted once every three frames, it is convenient to group three
frames together; one sync bit is transmitted for each group of three frames.

(d) Filter Parameters

The remaining 12 bits per frame are allocated to encode the filter parameters (Table 4) and are
transmitted once per frame. As usual, the filter coefficients are encoded jointly (i.e., quantized
vectorially through a pattern matching process). Such a quantization process results in efficient
coding, because the reference filter parameters do not contain parameters from nonspeech sounds.
In this approach, the given LSPs are compared with the stored LSP sets, and the index
corresponding to the best matching LSP set is transmitted. The LSP encoder, therefore, has two
functional modules: LSP template collection and template matching. The LSP quantization process is
discussed separately from pages 13 through 17.


Table 4 — Bit Allocation Per Three Frames for 800-b/s Voice Processor

Synchronization                     1
Pitch Period                        5
Amplitude Information               4 + 4 + 4 = 12
Filter Parameters (with voicing)    12 + 12 + 12 = 36
Total                               54 bits
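The bit budget of Table 4 can be tallied in a few lines to confirm that it yields exactly 800 b/s at the 22.5-ms frame rate. The dictionary keys and function name are ours; the numbers are those of the table.

```python
from fractions import Fraction

FRAME_RATE = Fraction(400, 9)   # 44.444... frames/s (22.5-ms frames)

# Bits per group of three frames, as listed in Table 4.
SUPERFRAME_BITS = {
    "synchronization": 1,
    "pitch_period": 5,
    "amplitude": 3 * 4,
    "filter_parameters": 3 * 12,
}


def bit_rate(bits=SUPERFRAME_BITS, frames=3):
    """Return (total bits per group, resulting data rate in b/s)."""
    total = sum(bits.values())
    return total, total * FRAME_RATE / frames


if __name__ == "__main__":
    total, rate = bit_rate()
    print(total, "bits per three frames ->", float(rate), "b/s")
```

The 54 bits per three frames average to 18 bits per frame.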

LSP Template Collection

Since 12 bits are allowed to encode the filter parameters, we use 4096 templates or patterns for
the LSPs. Of the 4096 LSP templates, 3840 are for voiced speech and 256 for unvoiced speech. These
figures are based on our experimentation with an 800-b/s voice processor that quantized reflection
coefficients vectorially [14]. According to our subsequent experimentation with LSPs as filter
parameters, we have no reason to change these figures.

Ideally, each LSP template produces a sound that is just noticeably different from that of the
closest template. Because the human ear is insensitive to small differences in the patterns, each
LSP in a given template has an allowable frequency tolerance (Fig. 9) within which there is no
perceptible sound change. When each member of an LSP set falls inside the respective frequency
tolerance of a reference LSP set, the two sets are treated as equal.


tolerance, the synthesized speech sounds no different. $F_k$ is the $k$th line spectrum, arranged
in ascending order: $F_1 < F_2 < \cdots < F_k < \cdots < F_{10}$.

During template collection, we initially store the first LSP set as a reference template.
Subsequently, we compare each new LSP set with all the stored reference LSP templates. If the new
LSP set falls outside the allowable frequency tolerance for every reference LSP template, then the
new LSP set becomes another reference LSP template (Fig. 10). In this investigation we used LSP
templates collected from the voices of 54 male and 12 female speakers uttering five sentences
each. During the template collection, the number of LSP sets that fell into each template was
counted. At the end, the templates representing the fewest sets were eliminated to reduce the
total number of templates to 4096.
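The collection procedure just described can be sketched as follows. This is an illustration only: the function names, the `tolerance_fn` callback (which returns a per-line tolerance in Hz for a given reference template), and the choice to credit an incoming set to the first template it matches are our assumptions.

```python
def within_tolerance(lsp, ref, tol):
    """True if every line spectrum lies inside the reference's tolerance."""
    return all(abs(x - r) <= t for x, r, t in zip(lsp, ref, tol))


def collect_templates(lsp_sets, tolerance_fn, max_templates=4096):
    """Build a template set from training LSP sets.

    lsp_sets     : iterable of LSP sets (lists of frequencies in Hz)
    tolerance_fn : hypothetical callback, ref -> list of tolerances in Hz
    Returns (templates, counts) after pruning the least-used templates.
    """
    templates, counts = [], []
    for lsp in lsp_sets:
        for idx, ref in enumerate(templates):
            if within_tolerance(lsp, ref, tolerance_fn(ref)):
                counts[idx] += 1          # treated as equal to this template
                break
        else:
            templates.append(list(lsp))   # outside every tolerance: new template
            counts.append(1)
    # Keep the most frequently matched templates, preserving collection order.
    ranked = sorted(range(len(templates)), key=lambda i: counts[i], reverse=True)
    keep = sorted(ranked[:max_templates])
    return [templates[i] for i in keep], [counts[i] for i in keep]


if __name__ == "__main__":
    data = [[100.0, 200.0], [105.0, 195.0], [300.0, 400.0], [101.0, 201.0]]
    print(collect_templates(data, lambda ref: [10.0] * len(ref)))
```

The first LSP set automatically becomes the first template, since the template list is empty when it arrives.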

Magnitude of LSP Frequency Tolerance

To utilize the 4096 LSP templates best, we have exploited both the ear's insensitivity to
frequency differences and the LSP's tolerance of spectral errors.

(a) Hearing Sensitivity to Frequency Differences

Because the ear cannot resolve differences at high frequencies as accurately as it does at low
frequencies, we may quantize higher frequency LSPs more coarsely than lower ones without
introducing audible speech degradation. It is well known that the amount of frequency variation
that produces a just-noticeable difference is approximately linear from 0.1 to 1 kHz, and it
increases logarithmically from 1 to 10 kHz [17]. We documented a similar relationship for
speech-like sounds using a pitch excitation signal with one of the ten line spectra incrementally
changed while all others remained equally spaced (i.e., a resonant-free condition) [2]. Figure 11
shows the resulting curve. We expect that the curve of actual speech sounds would be located
somewhere between these two curves. Figure 11 indicates that the frequency difference allowable
near 4 kHz can be twice as large as that near 0 Hz.

(b) Spectral Sensitivity of the LSP

When each line spectrum is perturbed, there is a corresponding spectral error in A(z). The
spectral-error sensitivity is a factor relating the error in each line spectrum (in Hz) to the
average spectral error of A(z) (in dB). To derive such an expression from Eq. (10), however, is
intractable. Also, a cross-coupling of all line-spectrum errors into the overall spectral error
makes the use of such an expression impractical. Therefore, we derived numerically a relationship
that relates the average spectral error of A(z) to all of the line-spectrum errors (hence,
including the effect of cross-couplings) from various speech samples. There is no approximation in
computing the average spectral error of A(z) from given line-spectrum errors. However, we imposed
the condition that each line spectrum must have an error proportional to the frequency separation
to its closest neighbor indicated in Fig. 9. Figure 12 is the resultant scatter plot. In our
judgment, a 2-dB average spectral error is as big an error as we can tolerate. Thus the allowable
frequency tolerance of each line spectrum as obtained from Fig. 12 is approximately 20% of the
frequency separation to its closest neighbor.

(c) Allowable Frequency Tolerance

Combining the effect of the hearing sensitivity to the frequency difference (Fig. 11) and the
spectral sensitivity of the LSP (Fig. 12), we have an allowable frequency tolerance for each LSP
(see Fig. 13).


As shown in Fig. 13, the allowable frequency tolerance is approximately 20, 30, and 40% of the
frequency separation to the closest neighbor for line spectra located below 1 kHz, between 1 and
2 kHz, and above 2 kHz, respectively. To verify this, we listened to many synthesized speech
samples while perturbing each line spectrum by a given amount. Indeed, we began to notice some
speech quality degradation when the perturbation exceeded the above-mentioned tolerance.
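The tolerance rule stated above translates into a small function. This is a sketch under stated assumptions: the function name is ours, the 20/30/40% figures are the approximate values quoted in the text, and the treatment of a line spectrum lying exactly at 1 or 2 kHz (assigned to the higher band here) is an arbitrary choice.

```python
def allowable_tolerance(lsp_hz):
    """Per-line frequency tolerance in Hz for an ascending LSP set.

    Each tolerance is a fraction of the separation to the closest neighbor:
    about 20% below 1 kHz, 30% from 1 to 2 kHz, and 40% above 2 kHz.
    Requires at least two line spectra.
    """
    tol = []
    last = len(lsp_hz) - 1
    for k, f in enumerate(lsp_hz):
        gaps = []
        if k > 0:
            gaps.append(f - lsp_hz[k - 1])     # distance to lower neighbor
        if k < last:
            gaps.append(lsp_hz[k + 1] - f)     # distance to upper neighbor
        nearest = min(gaps)
        if f < 1000.0:
            fraction = 0.20
        elif f < 2000.0:
            fraction = 0.30
        else:
            fraction = 0.40
        tol.append(fraction * nearest)
    return tol


if __name__ == "__main__":
    print(allowable_tolerance([500.0, 700.0, 1500.0, 2500.0]))
```

A closely spaced pair of line spectra (which indicates a sharp resonance) therefore receives a tight tolerance, while widely spaced ones are allowed to move more.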

Template Matching

The LSPs in each frame are compared with all of the LSP templates, and the index corresponding
to the closest match is transmitted. The template matching process (Fig. 14) computes the distance
to each template, while taking into account the spectral-error sensitivity and the hearing
sensitivity.
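An exhaustive template search might be sketched as below. The distance measure here is an assumption on our part, since the details of Fig. 14 are not given in the text: each frequency error is normalized by the allowable tolerance of the corresponding template line, so that the two sensitivities enter through the tolerance, and the squared normalized errors are summed. The `tolerance_fn` callback and function name are hypothetical.

```python
def match_template(lsp, templates, tolerance_fn):
    """Return the index of the template closest to the given LSP set.

    lsp          : LSP set of the current frame (frequencies in Hz)
    templates    : list of reference LSP sets
    tolerance_fn : hypothetical callback, ref -> list of tolerances in Hz
    Distance (assumed): sum of squared errors, each scaled by its tolerance.
    """
    best_index, best_dist = -1, float("inf")
    for idx, ref in enumerate(templates):
        tol = tolerance_fn(ref)
        dist = sum(((x - r) / t) ** 2 for x, r, t in zip(lsp, ref, tol))
        if dist < best_dist:
            best_index, best_dist = idx, dist
    return best_index


if __name__ == "__main__":
    refs = [[300.0, 900.0, 2100.0], [500.0, 1500.0, 2500.0]]
    fixed = lambda ref: [20.0, 30.0, 40.0]
    print(match_template([480.0, 1450.0, 2600.0], refs, fixed))
```

With 12 bits per frame, the returned index is the only filter information that needs to be transmitted.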

Though an exhaustive search of 4096 templates would appear to be a problem, our earlier 800-b/s
voice encoder was able to perform the task in real time with templates of ten reflection
coefficients [15]. Searching 4096 LSP templates should be no problem with current technology.

Speech  Intelligibility  vs  Bit-Error  Rate 

f  igure  15  shows  the  intelligibility  of  the  800-b/s  voice  encoder  that  is  discussed  in  this  section 
under  conditions  of  various  bit-error  rates.  Although  bit-errors  may  not  be  random  in  real  environ¬ 
ments.  the  use  of  random  bit-errors  for  testing  purpose  is  helpful  for  determining  the  strengths  and 
weaknesses  of  the  voice  processor  under  investigation.  Also,  we  have  similar  data  from  tests  of  other 
voice  processors  that  allow  us  to  compa.e  and  evaluate. 

The  rate  of  intelligibility  loss  caused  by  the  random  bit-errors  is  nearly  identical  among  different 
voice  processors  operating  at  different  rates  as  we  have  seen.  Figure  15  shows  a  similar  trend.  Thus, 
intelligibility  in  the  error-free  condition  can  be  used  to  predict  the  performance  under  bit  errors.  For 
this  particular  800-b/s  voice  encoder,  the  bit-error  rate  should  be  less  than  approximately  2%  to  ensure 
adequate  intelligibility.

Fig. 15 - Speech intelligibility (DRT score) vs random bit-error rate. The 800-b/s voice encoder will be usable if the corrected bit-error rate is limited to 2%.


ERROR  PROTECTION 

For this report, we investigated the potential advantages of providing error protection in an HF modem to all of the data bits. This is in contrast to the present 2400-b/s ANDVT TACTERM (CV-3592), which applies error protection to only the 24 most sensitive bits in each 54-bit LPC frame (see Table 2). The remaining 30 bits are transmitted without any error protection. To maintain as much common design as possible between the present 2400-b/s system and the 800-b/s system presented in this report, the modulation was restricted to four-phase differential phase shift keying (DPSK) with frequency-division multiplexing, a frame rate identical to that of the 2400-b/s LPC (i.e., 44.444 frames/s), and tones spaced 56.25 Hz apart.

Simulated  HF  Channel  and  Signal  Designs 

An  independent  Rayleigh  fading  channel  was  used  to  compare  the  performance  of  the  800-b/s 
system  with  the  ANDVT  TACTERM.  The  comparisons  were  made  by  using  four  different  signal 
designs  for  the  800-b/s  system.  Their  characteristics  were: 

•  800  b/s  transmission  rate  on  9  tones  with  no  coding  or  diversity; 

•  1600  b/s  transmission  rate  on  18  tones  with  dual  diversity; 

•  3200  b/s  transmission  rate  on  36  tones  with  quadruple  diversity;

•  3200  b/s  transmission  rate  on  36  tones  with  1/2  rate  (24,12)  Golay  coding  on  all  of  the  18 
information  bits  per  frame  and  transmitted  with  dual  diversity.  We  used  soft  decision 
decoding,  identical  to  that  used  in  the  ANDVT  TACTERM. 
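
The four signal designs all carry the same 800 b/s of voice data. The following sketch checks that arithmetic. It assumes that the quoted 44.444 frames/s is a rounded 400/9 (consistent with 54 bits per frame giving 2400 b/s) and that each four-phase DPSK tone carries two bits per frame; `information_rate` is a hypothetical helper name.

```python
from fractions import Fraction

FRAME_RATE = Fraction(400, 9)   # frames/s; assumed exact value behind 44.444
BITS_PER_TONE = 2               # four-phase DPSK: two bits per tone per frame


def information_rate(tones, diversity, code_rate=Fraction(1)):
    """Information rate in b/s after removing diversity and coding overhead."""
    gross = tones * BITS_PER_TONE * FRAME_RATE   # transmitted b/s
    return gross / diversity * code_rate


designs = [
    ("9 tones, no coding or diversity", 9, 1, Fraction(1)),
    ("18 tones, dual diversity", 18, 2, Fraction(1)),
    ("36 tones, quadruple diversity", 36, 4, Fraction(1)),
    ("36 tones, rate-1/2 Golay + dual diversity", 36, 2, Fraction(1, 2)),
]
for name, tones, diversity, code_rate in designs:
    print(name, "->", information_rate(tones, diversity, code_rate), "b/s")
```

Under these assumptions each design yields 18 information bits per frame, so the added transmission rate is spent entirely on redundancy.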

The independent Rayleigh fading channel is a textbook channel. It is a transmission channel that exhibits fading with a Rayleigh amplitude distribution [18], with additive Gaussian noise that is independent on each of the modem subchannels. That is, there is no correlation in the fading on the different modem tones, which is usually not true on a real HF channel. An independent Rayleigh fading channel is excellent for determining the potential advantages of diversity combining and of coding that cannot be interleaved over many frames to randomize bursts.

We used Monte Carlo simulation [19] for demonstration. It consists of the repetitive generation and demodulation of the received signal and its reference signal for each of the N tones in a modem frame. In a time-differential PSK system, the reference signal is the signal detected during the previous frame. It may be represented by two expressions that describe the in-phase and quadrature-phase components that would be obtained by correlating the received signal against a locally generated signal. The received signal during the present frame may be represented by two similar expressions. For the Rayleigh fading channel, the four expressions are:


In-phase, reference:

$$V_1 = V R_1 \cos(\phi_{R1} + \phi_1) + X_1 \qquad (12)$$

Quadrature, reference:

$$V_2 = V R_1 \sin(\phi_{R1} + \phi_1) + Y_1 \qquad (13)$$

In-phase, signal:

$$V_3 = V R_2 \cos(\phi_{R2} + \phi_2 + \phi_D) + X_2 \qquad (14)$$

Quadrature, signal:

$$V_4 = V R_2 \sin(\phi_{R2} + \phi_2 + \phi_D) + Y_2 \qquad (15)$$

where $\phi_1$ is the reference phase shift (set equal to zero in this simulation) and $\phi_2$ is the phase shift encoded in the transmitted signal. It was made equal to $\pi/4$ for all data symbols, which was equated to transmitting an all-zero word. $\phi_D$ is the phase shift caused by Doppler (also set to zero in the present simulation). The X and Y values were the in-phase and quadrature components of the additive Gaussian noise. The quantity V is a variable that controls the signal energy to noise density ratio, expressed as the energy per tone to noise density ratio ($E_t/N_0$):


$$\frac{E_t}{N_0} = 10 \log \left[\, \ldots \,\right] \ \text{dB} \qquad (16)$$


The quantity $E_t/N_0$ is related to the total signal energy to noise density ratio $P/N_0$ by

$$\frac{P}{N_0} = \frac{E_t}{N_0} + 10 \log \left( \frac{N}{T} \right) \ \text{dB} \qquad (17)$$

where N is the number of tones transmitted and T is the integration period, which is the reciprocal of the tone spacing.
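
As a rough numerical illustration of this relation, the sketch below evaluates the offset between $P/N_0$ and $E_t/N_0$ for the tone counts used in this report. It assumes that Eq. (17) has the form $P/N_0 = E_t/N_0 + 10 \log(N/T)$ with T equal to the reciprocal of the 56.25-Hz tone spacing; the function name is hypothetical.

```python
import math

TONE_SPACING_HZ = 56.25  # tone spacing stated for the modem


def total_to_per_tone_offset_db(n_tones, tone_spacing_hz=TONE_SPACING_HZ):
    """Return 10*log10(N/T) in dB, with T = 1 / tone spacing (assumed form of Eq. 17)."""
    integration_period = 1.0 / tone_spacing_hz   # seconds
    return 10.0 * math.log10(n_tones / integration_period)


for n in (9, 18, 36):
    print(n, "tones:", round(total_to_per_tone_offset_db(n), 2), "dB")
```

Under this reading, each doubling of the number of tones at a fixed $P/N_0$ lowers the energy available per tone by about 3 dB, which is the price that diversity and coding gains must repay.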

For a Rayleigh fading channel in which the fade rate is slow compared to the modem signaling rate (i.e., frame rate),

$$R_1 = R_2 \qquad (18)$$

and

$$\phi_{R1} = \phi_{R2}, \qquad (19)$$

which represent instantaneous samples of the channel-induced amplitude and phase variations on the received signal and its reference. In the simulation, each value was obtained by converting a random variable with uniform distribution to a random variable with a Gaussian distribution, and then converting that to a variable with a Rayleigh distribution [20].

Two Gaussian-distributed random variables with zero mean and unit variance, X and Y, were obtained by

$$X = [-2 \ln(A)]^{1/2} \cos(2 \pi B) \qquad (20)$$

$$Y = [-2 \ln(A)]^{1/2} \sin(2 \pi B) \qquad (21)$$

where A and B were variables randomly selected from a set with uniform distribution (0,1). Likewise, a sample from a Rayleigh amplitude distribution with a unit variance was obtained as

$$R = 0.707 \sqrt{X^2 + Y^2} \qquad (22)$$

with a phase angle

$$\phi = \tan^{-1} \left( \frac{Y}{X} \right). \qquad (23)$$
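
The sampling procedure can be sketched in a few lines. This is a minimal illustration, assuming the standard uniform-to-Gaussian transformation with a square root on the logarithmic term. The function names are hypothetical, and the four-quadrant `atan2` is used in place of a plain arctangent so that the phase covers the full circle.

```python
import math
import random


def gaussian_pair(rng):
    """Two independent zero-mean, unit-variance Gaussian samples from two uniforms."""
    a = 1.0 - rng.random()          # in (0, 1], avoids log(0)
    b = rng.random()
    magnitude = math.sqrt(-2.0 * math.log(a))
    return (magnitude * math.cos(2.0 * math.pi * b),
            magnitude * math.sin(2.0 * math.pi * b))


def rayleigh_sample(rng):
    """Rayleigh amplitude with unit mean square, plus its uniformly distributed phase."""
    x, y = gaussian_pair(rng)
    r = math.sqrt(0.5) * math.sqrt(x * x + y * y)   # 0.707 * sqrt(X^2 + Y^2)
    phi = math.atan2(y, x)                           # four-quadrant arctangent
    return r, phi
```

A call such as `rayleigh_sample(random.Random(1))` returns one amplitude and phase pair; averaging the squared amplitude over many samples should approach one.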


Demodulation 

The demodulation of the received signal was performed to recover an estimate of the transmitted information. The in-phase and quadrature components of the phase change of the received signal relative to the reference signal were

$$I = V_1 V_3 + V_2 V_4 \qquad (24)$$

$$Q = V_1 V_4 - V_2 V_3. \qquad (25)$$


For Gray-coded four-phase DPSK with two bits of information transmitted on each tone, the sign of I represented one information bit and the sign of Q represented the second bit of information. When diversity combining was performed, the values of I and the values of Q were added to those of a previous detection. Thus, for diversity

$$I_{div} = I_1 + I_2 \qquad (26)$$

$$Q_{div} = Q_1 + Q_2 \qquad (27)$$

and the detection of the received data is made on the signs of $I_{div}$ and $Q_{div}$.
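
The generation and demodulation steps can be combined into a small Monte Carlo sketch. It is a simplified illustration rather than the simulation used for this report: the amplitude `v` is supplied directly instead of being derived from $E_t/N_0$, slow fading is assumed (the same amplitude and phase on the reference and signal frames), diversity branches are assumed to fade independently, and all function names are hypothetical.

```python
import math
import random


def gaussian_pair(rng):
    a = 1.0 - rng.random()                      # in (0, 1], avoids log(0)
    b = rng.random()
    m = math.sqrt(-2.0 * math.log(a))
    return m * math.cos(2.0 * math.pi * b), m * math.sin(2.0 * math.pi * b)


def detect_tone(v, rng, diversity=1):
    """Return (I, Q) for one tone carrying the all-zero symbol (phase shift pi/4)."""
    phi_1, phi_2, phi_d = 0.0, math.pi / 4.0, 0.0
    i_div = q_div = 0.0
    for _ in range(diversity):
        x, y = gaussian_pair(rng)
        r = math.sqrt(0.5) * math.hypot(x, y)   # Rayleigh amplitude, R1 = R2
        phi_r = math.atan2(y, x)                # channel phase, same on both frames
        x1, y1 = gaussian_pair(rng)             # noise on the reference frame
        x2, y2 = gaussian_pair(rng)             # noise on the signal frame
        v1 = v * r * math.cos(phi_r + phi_1) + x1
        v2 = v * r * math.sin(phi_r + phi_1) + y1
        v3 = v * r * math.cos(phi_r + phi_2 + phi_d) + x2
        v4 = v * r * math.sin(phi_r + phi_2 + phi_d) + y2
        i_div += v1 * v3 + v2 * v4              # in-phase component of phase change
        q_div += v1 * v4 - v2 * v3              # quadrature component of phase change
    return i_div, q_div


def bit_error_rate(v, diversity, n_tones, seed=0):
    """Fraction of bits detected as 1 when the all-zero word is sent."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_tones):
        i, q = detect_tone(v, rng, diversity)
        errors += (i < 0.0) + (q < 0.0)         # a negative sign is a bit error
    return errors / (2.0 * n_tones)
```

For example, comparing `bit_error_rate(10.0, 1, 20000)` with `bit_error_rate(10.0, 2, 20000)` gives a feel for the benefit of dual diversity at a fixed per-tone amplitude.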

The soft decision decoding algorithm [21] was based on making up to 16 separate trials at decoding each received code word of 24 bits, using all permutations of the four bits with the lowest confidence. The best estimate of the correct data was obtained by selecting the decoding that indicated the errors were on the combination of bits with the lowest overall confidence. In this simulation of a coded system, separate code words were assigned to the in-phase and quadrature components, thus reducing the possibility of multiple errors in a code word when one signal was severely faded. That is similar to the code assignments used in the ANDVT TACTERM.
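
The trial-and-select procedure can be sketched independently of the particular code. In the illustration below, `hard_decode` is a hypothetical caller-supplied hard-decision decoder (for this report it would be a (24,12) Golay decoder, which is not reproduced here), a positive soft value is assumed to mean bit 0, and the magnitude is taken as the confidence. The short repetition-code decoder is only a stand-in for demonstration.

```python
from itertools import product


def soft_decision_decode(soft, hard_decode, n_weak=4):
    """Try all flip patterns on the least-confident bits; keep the cheapest decoding."""
    hard = [0 if s >= 0 else 1 for s in soft]       # hard decisions from signs
    conf = [abs(s) for s in soft]                   # confidence of each decision
    weak = sorted(range(len(soft)), key=lambda k: conf[k])[:n_weak]
    best, best_cost = None, float("inf")
    for flips in product((0, 1), repeat=len(weak)):  # 2**4 = 16 trials by default
        trial = list(hard)
        for pos, flip in zip(weak, flips):
            trial[pos] ^= flip
        codeword = hard_decode(trial)
        if codeword is None:                         # decoder reported failure
            continue
        # Cost: total confidence of the bits this decoding declares to be in error.
        cost = sum(c for c, h, w in zip(conf, hard, codeword) if h != w)
        if cost < best_cost:
            best, best_cost = codeword, cost
    return best


def majority_decode(bits):
    """Stand-in hard decoder for a repetition code (demonstration only)."""
    bit = 1 if 2 * sum(bits) > len(bits) else 0
    return [bit] * len(bits)
```

With `soft = [0.1, 0.2, -3.0, -2.5, 0.15]`, a plain majority vote on the signs would choose all zeros, whereas `soft_decision_decode(soft, majority_decode)` prefers the all-ones word because the three positive values carry very little confidence.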

Modem  Performance

Figure 16 shows the average bit-error rates of the 800- and 2400-b/s designs, plotted against the total signal energy to noise density ratio (P/N0). Figure 16(a) clearly shows the advantage of using dual diversity to provide an initial improvement of 5 dB at a bit-error rate of 1%, followed by coding with soft decision decoding to give a total improvement of 12 dB at a 1% error rate over the straight 800-b/s design. In Fig. 16(b), the 2400-b/s design shows a similar improvement between uncoded and coded voice data.

Figure 17 shows a comparison between the 800-b/s design with coding and dual diversity (Fig. 16a) and the 2400-b/s design with coding (Fig. 16b). As noted, the 800-b/s design has a 4-dB advantage at a bit-error rate of 1 to 2%. This is a 4-dB advantage over the coded portion of the 2400-b/s system. The other 30 bits of the 2400-b/s system are transmitted uncoded, and they contribute to intelligibility only under very low bit-error conditions. At high bit-error rates the 30 uncoded bits are a liability.


Signal Energy to Noise Density Ratio (dB)

Fig. 17 - Comparison between 800-b/s design with coding and dual diversity and 2400-b/s design. The 800-b/s design has an advantage of nearly 4 dB over the 2400-b/s design. This advantage is equivalent to receiving two and a half times more signal power.


Figure 18 shows speech intelligibility in terms of the signal energy to noise density ratio. This figure is obtained by combining Fig. 17 (bit-error rate vs P/N0) and Fig. 15 (intelligibility vs bit-error rate). It is significant that speech intelligibility degrades from good to poor with a 2-dB reduction in P/N0. When the 2400-b/s system operates near the knee of the performance curve, the use of the 800-b/s system is much preferred. This report shows that the usable range of P/N0 can be extended by nearly 4 dB.
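
The correspondence between a decibel advantage and a power ratio is a one-line conversion, shown here as a quick check of the 4-dB figure. The function name is hypothetical.

```python
def db_to_power_ratio(db):
    """Convert a power advantage in decibels to a linear power ratio."""
    return 10.0 ** (db / 10.0)


print(round(db_to_power_ratio(4.0), 2))  # about two and a half times the power
```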


Signal Energy to Noise Density Ratio (dB)

Fig. 18 - Speech intelligibility vs total signal energy to noise density ratio. Note that when the 2400-b/s system operates near the knee of the performance curve, the use of the 800-b/s design is preferred. In terms of speech intelligibility, the 800-b/s design has an advantage of 3.5 dB over the 2400-b/s design.


CONCLUSIONS

This report discusses the results of our efforts to improve voice communication in the presence of bit errors. In particular, this improvement is designed for tactical communicators who use primarily narrowband channels and operate in congested platforms in close proximity to hostile forces. We have generated a more robust voice coding algorithm that can be integrated into the existing narrowband voice terminal so that the communicator can select either the DoD-standard 2400-b/s LPC, which is interoperable with all narrowband users, or the optional mode presented in this report.

Improved error-resistant performance is obtained by removing speech redundancies to lower the data rate from 2400 to 800 b/s, and then introducing other redundancies in the form of frequency diversity and coding to provide error protection. To simplify the implementation, we have maintained the basic features of the ANDVT in speech processing, error protection, and modem designs.

We chose a slow independent Rayleigh fading channel to make a performance comparison between the 2400- and 800-b/s systems. The most significant conclusions follow:

•  The error rate for the 800-b/s system is one order of magnitude less than that for the 2400-b/s system for a wide range of signal energy to noise density ratios (Fig. 17).

•  For an error rate of 2% or less, the 800-b/s system has a 4-dB advantage in the signal energy to noise density ratio over the 2400-b/s system (Fig. 17).

•  When the 2400-b/s system provides poor speech quality, the 800-b/s system provides better speech quality even when the signal energy to noise density ratio is 3.5 dB less (Fig. 18). In other words, the 800-b/s system behaves like the 2400-b/s system operating under 2.5 times more signal power.

This report represents an initial attempt to provide a more robust performance for narrowband users. It is intended to create interest among DoD policy makers, program sponsors, and system designers. We think this approach is worthy of continued investigation.


RECOMMENDATIONS

Prior to committing to a prototype implementation, we recommend the following tasks:

•  The  800-b/s  voice  processor  should  be  programmed  to  run  in  real  time,  not  only  to  allow 
performance  of  additional  tests  but  to  gain  experience  in  generating  the  real-time  software. 

•  The overall performance should be further evaluated by using other forms of channel disturbances.

•  Efforts  should  be  continued  to  develop  a  voice  processor  capable  of  generating  intelligible 
speech  at  lower  bit  rates. 